Skip to content

[Configurator] Add GymnasiumAdapter for CloudAI envs#894

Open
rutayan-nv wants to merge 2 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/gymnasium-adapter
Open

[Configurator] Add GymnasiumAdapter for CloudAI envs#894
rutayan-nv wants to merge 2 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/gymnasium-adapter

Conversation

@rutayan-nv
Copy link
Copy Markdown
Contributor

  • GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env: Dict of Discrete actions over the tunable params, float32 Box observation derived from define_observation_space(), gymnasium 5-tuple from step / step_raw.
  • Fixed (single-value) params are injected on every step so agents only ever see the tunable subset; step_raw bypasses index decoding and fixed injection for callers that already have a fully-formed parameter dict.
  • The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths match whether the env is driven by the DSE loop or by an external training loop wrapping the adapter.
  • CloudAIGymEnv.define_observation_space() now returns one slot per agent metric (at least one), giving adapters the correct Box shape.
  • gymnasium is an optional dependency: lazy-imported inside the adapter and exposed via a new rl extra (also added to dev) pinned to ~=1.2.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 15, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Adds GymnasiumAdapter wrapping BaseGym to expose Gymnasium-style action/observation spaces and APIs; makes CloudAIGym observation size metric-driven; exports the adapter; adds optional gymnasium dev/rl extras and a comprehensive pytest suite for the adapter.

Changes

Gymnasium adapter integration

Layer / File(s) Summary
CloudAIGymEnv observation-space contract
src/cloudai/configurator/cloudai_gym.py
define_observation_space() returns a float list sized to max(len(agent_metrics), 1) and reset() uses it to produce initial observations instead of a fixed placeholder.
GymnasiumAdapter implementation
src/cloudai/configurator/gymnasium_adapter.py
New adapter with lazy import guard for gymnasium/numpy; builds action_space as spaces.Dict of spaces.Discrete for tunable params (excluding single-value fixed params), observation_space as spaces.Box of float32 shaped from env.define_observation_space(); implements decode_action, reset, step, step_raw, render, key validation, test-run step synchronization, and observation conversion.
Module exports and dependencies
pyproject.toml, src/cloudai/configurator/__init__.py
Adds gymnasium~=1.2 to dev and an rl extra depending on it; exports GymnasiumAdapter from the configurator package.
GymnasiumAdapter test suite
tests/test_gymnasium_adapter.py
Deterministic fake environments and fixtures validate action/observation spaces, reset/step outputs and dtypes, action decoding, fixed-parameter injection, step_raw behavior, test_run.step synchronization, and missing-dependency/input-validation error messages.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I wrapped the gym with careful paws,
Discrete choices hop, fixed ones stay small.
Steps count from one, observations align,
Float32 arrays march in tidy rows.
A rabbit applauds this adapter call.

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding GymnasiumAdapter for CloudAI environments.
Description check ✅ Passed The description is directly related to the changeset, detailing the GymnasiumAdapter implementation, fixed parameters handling, step numbering, observation space changes, and optional dependency setup.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

This was referenced May 15, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 92-112: decode_action and step_raw currently allow partial or
unexpected parameter sets which can silently produce incorrect runs; update
decode_action, step, and step_raw to validate that the final parameter keys
exactly match the expected set (the union of self._tunable_params.keys() and
self._fixed_params.keys()) before calling _step_with_params, and raise a clear
exception (e.g., ValueError) listing missing or unexpected keys if validation
fails so the caller fails fast; enforce the same check when step builds params
from decode_action and when step_raw is passed params directly, and reference
the symbols decode_action, step, step_raw, self._tunable_params,
self._fixed_params, and _step_with_params to locate the changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 9474faf6-0f97-42f8-82a0-1e23932471bc

📥 Commits

Reviewing files that changed from the base of the PR and between 4bdc465 and 71ddd51.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • tests/test_gymnasium_adapter.py

Comment thread src/cloudai/configurator/gymnasium_adapter.py
@rutayan-nv rutayan-nv changed the title feat(configurator): add GymnasiumAdapter for CloudAI envs [Configurator] Add GymnasiumAdapter for CloudAI envs May 15, 2026
- GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env:
  Dict of Discrete actions over the tunable params, float32 Box observation
  derived from define_observation_space(), gymnasium 5-tuple from step/step_raw.
- Fixed (single-value) params are injected on every step so agents only ever
  see the tunable subset; step_raw bypasses index decoding and fixed injection
  for callers that already have a fully-formed parameter dict.
- The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths
  match whether the env is driven by the DSE loop or an external training loop.
- CloudAIGymEnv.define_observation_space() now returns one slot per agent
  metric (at least one), giving adapters the correct Box shape.
- gymnasium is an optional dependency: lazy-imported inside the adapter and
  exposed via a new rl extra (also added to dev) pinned to ~=1.2.
decode_action used to accept partial action dicts and step_raw accepted
arbitrary key sets, both silently propagating incomplete params to the
underlying env. Validate up front:

- decode_action requires keys exactly == self._tunable_params, and each
  discrete index must be in range.
- step_raw requires keys exactly == self._tunable_params | self._fixed_params.

A new _assert_keys helper raises ValueError with sorted missing / extra lists
so callers see exactly what was wrong. Adds tests for missing key, unknown
key, out-of-range index, missing fixed param, and unknown raw key paths.
@rutayan-nv rutayan-nv force-pushed the rpatro/gymnasium-adapter branch from 9937ef3 to 78b1bb3 Compare May 18, 2026 16:33
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 79-82: The constraint-failure branch returns a hardcoded
single-element observation that violates the declared observation shape; update
the branch to return the same-width list as the normal case by using the
metric-derived width (use max(len(self.test_run.test.agent_metrics), 1)) instead
of a single 0.0 so that both the regular return (currently using [0.0] *
max(len(self.test_run.test.agent_metrics), 1)) and the constraint-failure branch
produce identically shaped observations.

In `@tests/test_gymnasium_adapter.py`:
- Around line 209-215: The tests assert on ValueError messages using exact list
formatting (e.g., match=r"missing=\['param_b'\]"), which is brittle; update the
assertions in test_decode_action_missing_key and
test_decode_action_rejects_unknown_keys to use more focused regexes that only
check for the presence of the key names (e.g., look for "param_b" and "bogus")
or words like "missing" / "extra" rather than exact list punctuation; adjust the
two places mentioned (the current block around adapter.decode_action({"param_a":
0}) and the block for adapter.decode_action({"param_a": 0, "param_b": 1,
"bogus": 0}) and the similar assertions at lines 226-234) to use these looser
key-focused regexes so tests validate behavior without depending on formatting.
- Around line 131-136: Remove the test's assertion on the private attribute
adapter._fixed_params and instead verify public behavior: keep the assertion
that adapter.action_space.spaces equals {"param_a","param_b"}, then call
GymnasiumAdapter.step (using the _FixedParamGym instance which records received
actions) with an action containing only "param_a"/"param_b" and assert the
environment received/processed an action that includes the fixed parameter (e.g.
the recorded last_action or returned observation/info shows "fixed_param": 42);
this exercises action-space exclusion plus the merged action behavior without
relying on the private _fixed_params field.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: bf1b3854-aa01-4f65-8aeb-7699901ac890

📥 Commits

Reviewing files that changed from the base of the PR and between 9937ef3 and 78b1bb3.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • tests/test_gymnasium_adapter.py

Comment on lines +79 to +82
list: One float slot per agent metric (at least one), giving the correct shape
for adapters that derive ``gymnasium.spaces.Box`` from this output.
"""
return [0.0]
return [0.0] * max(len(self.test_run.test.agent_metrics), 1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return constraint-failure observations with the declared shape.

Line 82 makes observation width metric-driven, but the constraint-failure branch still returns a hardcoded single-element observation (Line 143). That breaks the fixed observation-shape contract expected by Gymnasium consumers.

Suggested fix
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            fallback_obs = [-1.0] * len(self.define_observation_space())
+            return fallback_obs, self.rewards.constraint_failure, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 79 - 82, The
constraint-failure branch returns a hardcoded single-element observation that
violates the declared observation shape; update the branch to return the
same-width list as the normal case by using the metric-derived width (use
max(len(self.test_run.test.agent_metrics), 1)) instead of a single 0.0 so that
both the regular return (currently using [0.0] *
max(len(self.test_run.test.agent_metrics), 1)) and the constraint-failure branch
produce identically shaped observations.

Comment on lines +131 to +136
def test_single_value_params_are_excluded_from_action_space() -> None:
adapter = GymnasiumAdapter(_FixedParamGym())

assert set(adapter.action_space.spaces) == {"param_a", "param_b"}
assert adapter._fixed_params == {"fixed_param": 42}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Avoid asserting private adapter internals in tests.

Line 135 couples the test to adapter._fixed_params, which is implementation detail. Assert only public behavior (action-space exclusion + merged action on step) to keep tests stable across refactors.

Suggested diff
 def test_single_value_params_are_excluded_from_action_space() -> None:
     adapter = GymnasiumAdapter(_FixedParamGym())

     assert set(adapter.action_space.spaces) == {"param_a", "param_b"}
-    assert adapter._fixed_params == {"fixed_param": 42}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def test_single_value_params_are_excluded_from_action_space() -> None:
adapter = GymnasiumAdapter(_FixedParamGym())
assert set(adapter.action_space.spaces) == {"param_a", "param_b"}
assert adapter._fixed_params == {"fixed_param": 42}
def test_single_value_params_are_excluded_from_action_space() -> None:
adapter = GymnasiumAdapter(_FixedParamGym())
assert set(adapter.action_space.spaces) == {"param_a", "param_b"}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_gymnasium_adapter.py` around lines 131 - 136, Remove the test's
assertion on the private attribute adapter._fixed_params and instead verify
public behavior: keep the assertion that adapter.action_space.spaces equals
{"param_a","param_b"}, then call GymnasiumAdapter.step (using the _FixedParamGym
instance which records received actions) with an action containing only
"param_a"/"param_b" and assert the environment received/processed an action that
includes the fixed parameter (e.g. the recorded last_action or returned
observation/info shows "fixed_param": 42); this exercises action-space exclusion
plus the merged action behavior without relying on the private _fixed_params
field.

Comment on lines +209 to +215
with pytest.raises(ValueError, match=r"missing=\['param_b'\]"):
adapter.decode_action({"param_a": 0})


def test_decode_action_rejects_unknown_keys(adapter: GymnasiumAdapter) -> None:
with pytest.raises(ValueError, match=r"extra=\['bogus'\]"):
adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Relax exception-message matching to avoid brittle formatting dependencies.

These assertions currently depend on exact list representation/order in error strings. Prefer key-focused regex so tests validate behavior, not message formatting details.

Suggested diff
 def test_decode_action_rejects_missing_keys(adapter: GymnasiumAdapter) -> None:
-    with pytest.raises(ValueError, match=r"missing=\['param_b'\]"):
+    with pytest.raises(ValueError, match=r"missing=.*param_b"):
         adapter.decode_action({"param_a": 0})

 def test_decode_action_rejects_unknown_keys(adapter: GymnasiumAdapter) -> None:
-    with pytest.raises(ValueError, match=r"extra=\['bogus'\]"):
+    with pytest.raises(ValueError, match=r"extra=.*bogus"):
         adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0})
@@
 def test_step_raw_rejects_missing_fixed_param() -> None:
@@
-    with pytest.raises(ValueError, match=r"missing=\['fixed_param'\]"):
+    with pytest.raises(ValueError, match=r"missing=.*fixed_param"):
         adapter.step_raw({"param_a": 1, "param_b": 10})
@@
 def test_step_raw_rejects_unknown_keys() -> None:
@@
-    with pytest.raises(ValueError, match=r"extra=\['bogus'\]"):
+    with pytest.raises(ValueError, match=r"extra=.*bogus"):
         adapter.step_raw({"param_a": 1, "param_b": 10, "fixed_param": 42, "bogus": 0})

Also applies to: 226-234

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_gymnasium_adapter.py` around lines 209 - 215, The tests assert on
ValueError messages using exact list formatting (e.g.,
match=r"missing=\['param_b'\]"), which is brittle; update the assertions in
test_decode_action_missing_key and test_decode_action_rejects_unknown_keys to
use more focused regexes that only check for the presence of the key names
(e.g., look for "param_b" and "bogus") or words like "missing" / "extra" rather
than exact list punctuation; adjust the two places mentioned (the current block
around adapter.decode_action({"param_a": 0}) and the block for
adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0}) and the similar
assertions at lines 226-234) to use these looser key-focused regexes so tests
validate behavior without depending on formatting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant